
    Automatic identification of rhetorical relations among intra-sentence discourse segments in Arabic

    © 2019 Inderscience Enterprises Ltd. Identifying discourse relations, whether implicit or explicit, has seen renewed interest and remains an open challenge. We present the first model that automatically identifies both explicit and implicit rhetorical relations among intra-sentence discourse segments in Arabic text. We build a large discourse-annotated corpus following the rhetorical structure theory framework. Our list of rhetorical relations is organised into a three-level hierarchy of 23 fine-grained relations, grouped into seven classes. To learn these relations automatically, we evaluate and reuse features from the literature, and contribute three additional features: accusative of purpose, specific connectives, and the number of antonym words. We perform experiments on identifying both fine-grained and coarse-grained relations. The results show that, compared with all the baselines, our model achieves the best performance in most cases, with an accuracy of 91.05%
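    The lexicon-based features the abstract names (specific connectives, antonym counts) can be sketched as simple counting functions over a segment's tokens. The word lists below are hypothetical English stand-ins for illustration only; the paper works on Arabic with its own lexicons.

    ```python
    # Hypothetical stand-in lexicons; the actual model uses Arabic resources.
    CONNECTIVES = {"because", "therefore", "although", "while"}
    ANTONYMS = {("hot", "cold"), ("rise", "fall"), ("open", "close")}

    def relation_features(tokens):
        """Count specific connectives and antonym pairs in a discourse segment."""
        connective_count = sum(1 for t in tokens if t in CONNECTIVES)
        token_set = set(tokens)
        antonym_count = sum(1 for a, b in ANTONYMS
                            if a in token_set and b in token_set)
        return {"connectives": connective_count, "antonyms": antonym_count}

    feats = relation_features(
        "prices rise because demand is hot while supply is cold".split())
    ```

    Counts like these are appended to the feature vector alongside the features reused from the literature.
    
    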

    Combining RSS-SVM with genetic algorithm for Arabic opinions analysis

    Copyright © 2019 Inderscience Enterprises Ltd. Because of the large number of Arabic-language users, researchers are drawn to Arabic sentiment analysis and, more precisely, to classification. The most accurate classification technique used in this area is the support vector machine (SVM) classifier, which can increase opinion-mining rates but only with a very small number of features. Reducing the feature vector, however, can degrade system performance by deleting pertinent features. To overcome these two constraints, our idea is to use the random subspace (RSS) algorithm to generate several feature vectors of limited size, and to replace the decision-tree base classifier of RSS with an SVM. We then implemented a further improvement: using a genetic algorithm as the feature-subset generator, based on a correlation criterion, to eliminate the random choice used by RSS and to prevent the use of incoherent feature subsets
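    The core RSS-SVM idea, a random-subspace ensemble whose base learner is an SVM rather than a decision tree, can be sketched with scikit-learn's `BaggingClassifier` restricted to feature subsampling. The dataset and all parameter values below are illustrative, not the paper's setup.

    ```python
    from sklearn.datasets import make_classification
    from sklearn.ensemble import BaggingClassifier
    from sklearn.svm import SVC

    # Synthetic stand-in for an Arabic-opinion feature matrix.
    X, y = make_classification(n_samples=200, n_features=40,
                               n_informative=10, random_state=0)

    # Random subspace method: every base SVM is trained on all samples but
    # only a random half of the features (bootstrap=False keeps all samples).
    rss_svm = BaggingClassifier(SVC(kernel="linear"),
                                n_estimators=10,
                                max_features=0.5,
                                bootstrap=False,
                                bootstrap_features=False,
                                random_state=0)
    rss_svm.fit(X, y)
    score = rss_svm.score(X, y)
    ```

    The paper's genetic-algorithm variant would replace the random feature draw with correlation-guided subset selection; that part is not shown here.
    
    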

    Multi source retinal fundus image classification using convolution neural networks fusion and Gabor-based texture representation

    Glaucoma is one of the best-known irreversible chronic eye diseases and leads to permanent blindness, but it can be treated if diagnosed early. Convolutional neural networks (CNNs), a branch of deep learning, have an impressive record in image analysis and interpretation, including medical imaging, thanks to their capacity to extract pertinent features automatically from the original image. In addition, ensemble learning algorithms have an important impact on improving the classification rate. In this paper, a two-stage approach based on image processing and ensemble learning is proposed for automated glaucoma diagnosis. In the first stage, different modalities are generated from the original images by applying advanced image processing techniques, in particular Gabor filter-based texture images. Next, the dataset constructed from each modality is learned by an individual CNN classifier. Aggregation techniques are then applied to produce the final decision, taking into account the outputs of all the CNN classifiers. Experiments were carried out on the RIM-ONE dataset for glaucoma diagnosis. The results demonstrate the superiority of the proposed ensemble learning system over existing studies, with a classification accuracy of 89.63%
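    The Gabor texture modality mentioned above is produced by convolving each fundus image with a bank of Gabor kernels. A minimal NumPy sketch of such a kernel (a Gaussian envelope modulating a cosine carrier) is shown below; the parameter values are illustrative, not those used in the paper.

    ```python
    import numpy as np

    def gabor_kernel(size=31, wavelength=8.0, theta=0.0, sigma=4.0, gamma=0.5):
        """Real-valued Gabor kernel: Gaussian envelope times cosine carrier."""
        half = size // 2
        y, x = np.mgrid[-half:half + 1, -half:half + 1]
        # Rotate coordinates by the filter orientation theta.
        x_t = x * np.cos(theta) + y * np.sin(theta)
        y_t = -x * np.sin(theta) + y * np.cos(theta)
        envelope = np.exp(-(x_t**2 + (gamma * y_t)**2) / (2 * sigma**2))
        carrier = np.cos(2 * np.pi * x_t / wavelength)
        return envelope * carrier

    # A small filter bank over four orientations, one texture image per filter.
    bank = [gabor_kernel(theta=t)
            for t in np.linspace(0, np.pi, 4, endpoint=False)]
    ```

    Each filtered image then feeds its own CNN, and the per-modality CNN outputs are aggregated for the final decision.
    
    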

    Authors' Writing Styles Based Authorship Identification System Using the Text Representation Vector

    © 2019 IEEE. Text mining is one of the main and typical tasks of machine learning (ML). Authorship identification (AI) is a standard research subject in text mining and natural language processing (NLP) that has undergone remarkable evolution in recent years. The task is to identify the actual author of an anonymous text on the basis of a set of writing samples. Standard text classification often relies on many handcrafted features such as dictionaries, knowledge bases, and various stylometric characteristics, which often leads to very high dimensionality. Unlike traditional approaches, this paper proposes an authorship identification approach based on automatic feature engineering using word2vec word embeddings, taking into account each author's writing style. The system includes two learning phases: the first generates a semantic representation of each author by using word2vec to learn and extract the most relevant characteristics of the raw documents; the second applies a multilayer perceptron (MLP) classifier to learn the classification rules using the backpropagation algorithm. Experiments show that the MLP classifier with the word2vec model achieves an accuracy of 95.83% on an English corpus, suggesting that word2vec word embeddings can clearly enhance identification accuracy compared with classical models such as n-gram frequencies and bag of words
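    The two-stage pipeline can be sketched as: represent each document by the mean of its word vectors, then train an MLP on those document vectors. The tiny random embedding table below is a hypothetical stand-in for a trained word2vec model, and the corpus and labels are invented for illustration.

    ```python
    import numpy as np
    from sklearn.neural_network import MLPClassifier

    rng = np.random.default_rng(0)
    vocab = ["the", "ship", "sea", "court", "law", "judge"]
    # Stand-in for word2vec: in practice these come from gensim's Word2Vec.
    embeddings = {w: rng.normal(size=16) for w in vocab}

    def doc_vector(tokens):
        """Average the word vectors of a document's in-vocabulary tokens."""
        vecs = [embeddings[t] for t in tokens if t in embeddings]
        return np.mean(vecs, axis=0) if vecs else np.zeros(16)

    docs = [["the", "ship", "sea"], ["sea", "ship"],
            ["court", "law"], ["the", "judge", "law"]]
    authors = [0, 0, 1, 1]  # toy author labels

    X = np.stack([doc_vector(d) for d in docs])
    clf = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000,
                        random_state=0).fit(X, authors)
    ```

    Averaging keeps the representation at the embedding dimension (16 here), which is the low-dimensionality advantage the abstract claims over n-gram and bag-of-words features.
    
    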

    A legal study on right of civil servants under the contractual based employment / Harold Emparak Kerebo … [et al.]

    The purpose of this study is to highlight and examine the problems faced by civil servants on contract-based employment and to discover possible solutions to those problems. Contract-based employment is sometimes needed because the service is required only for a short duration and the nature of the work is more suitable for contract employees than for permanent employees. The findings show that employees under a contract of service are granted benefits similar to those of permanent employees, such as medical, housing, and travelling allowances. The denial of such employment benefits applies only to employees who are under a contract for service

    Deceptive Opinions Detection Using New Proposed Arabic Semantic Features

    Some users post false reviews to promote or to devalue others' products and services. This practice is known as deceptive opinion spam, where spammers try to gain or profit from posting untruthful reviews. We therefore conducted this work to develop and implement new semantic features to improve Arabic deception detection. These features were inspired by the study of discourse parsing and rhetorical relations in Arabic. Given the importance of the phrase unit in the Arabic language and in grammatical studies, we analysed and selected the most frequently used unit markers and relations to calculate the proposed features, which were then used to represent the review texts in the classification phase. The most accurate classification technique in this area, as demonstrated by several previous works, is the support vector machine (SVM) classifier. However, Arabic annotated resources remain scarce, especially for deception detection, which is a relatively new research area. We therefore used a semi-supervised SVM to overcome this problem by exploiting unlabeled data
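    One common way to realise a semi-supervised SVM is self-training, where a probabilistic SVM iteratively labels its most confident unlabeled examples. The sketch below uses scikit-learn's `SelfTrainingClassifier` wrapper on synthetic data; the abstract does not specify which semi-supervised SVM variant was used, so take this as one plausible instantiation.

    ```python
    from sklearn.datasets import make_classification
    from sklearn.semi_supervised import SelfTrainingClassifier
    from sklearn.svm import SVC

    # Synthetic stand-in for review feature vectors.
    X, y = make_classification(n_samples=300, n_features=20, random_state=1)

    # Pretend only the first 100 reviews are labeled; -1 marks unlabeled data.
    y_partial = y.copy()
    y_partial[100:] = -1

    # Self-training: the SVM labels its most confident unlabeled samples
    # and retrains, which requires probability estimates.
    model = SelfTrainingClassifier(SVC(probability=True, random_state=1))
    model.fit(X, y_partial)
    score = model.score(X, y)
    ```

    Only a third of the labels are used for supervision; the rest of the signal comes from the pseudo-labeled reviews.
    
    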

    Multi-modal classifier fusion with feature cooperation for glaucoma diagnosis

    Background: Glaucoma is a major public health problem that can lead to optic nerve lesions and requires systematic screening in the population over 45 years of age. The diagnosis and classification of this disease have developed markedly in recent years, particularly through machine learning. Multimodal data have been shown to be a significant aid to machine learning, especially through their contribution to improving data-driven decision-making. Method: Solving classification problems with combinations of classifiers increases robustness as well as classification reliability by exploiting the complementarity that may exist between classifiers; complementarity is considered a key property of multimodality. Convolutional neural networks (CNNs) work very well in pattern recognition and have shown superior performance, especially for image classification, because they can learn useful features from raw data by themselves. This article proposes a multimodal classification approach based on deep CNN and support vector machine (SVM) classifiers, using multimodal data and multimodal features for glaucoma diagnosis from retinal fundus images of the RIM-ONE dataset. We use handcrafted feature descriptors such as the grey-level co-occurrence matrix, central moments, and Hu moments to cooperate with the features automatically generated by the CNN, in order to properly detect the optic nerve and consequently obtain a better classification rate, allowing a more reliable diagnosis of glaucoma. Results: The experimental results confirm that combining classifiers with the BWWV technique is better than learning the classifiers separately. The proposed method provides a computerised diagnosis system for glaucoma with impressive results compared with the main related studies, which encourages us to continue along this research path
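    The grey-level co-occurrence matrix (GLCM) named among the handcrafted descriptors counts how often pairs of grey levels appear as neighbours, and texture statistics such as contrast are derived from it. A minimal pure-NumPy sketch for a single horizontal offset is below (libraries such as scikit-image provide full implementations with multiple offsets and angles).

    ```python
    import numpy as np

    def glcm_contrast(image, levels):
        """Horizontal-neighbour GLCM of a quantised image and its contrast."""
        glcm = np.zeros((levels, levels))
        # Count each (left pixel, right neighbour) grey-level pair.
        for a, b in zip(image[:, :-1].ravel(), image[:, 1:].ravel()):
            glcm[a, b] += 1
        p = glcm / glcm.sum()              # normalise to joint probabilities
        i, j = np.indices(p.shape)
        contrast = float((p * (i - j) ** 2).sum())
        return glcm, contrast

    # Toy 2-level image: pairs (0,0), (0,1), (0,1), (1,1).
    img = np.array([[0, 0, 1],
                    [0, 1, 1]])
    glcm, contrast = glcm_contrast(img, levels=2)
    ```

    Statistics like this contrast value are concatenated with the CNN's learned features before the fused classification stage.
    
    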

    Multi-classifier system for authorship verification task using word embeddings

    © 2018 IEEE. Authorship verification is a topic of growing research interest that has developed considerably in recent years. The task is to determine whether an unknown document belongs to the set of documents known to be written by a given author. Classical text classifiers often rely on many human-designed features, such as dictionaries, knowledge bases, and special tree kernels; other studies use n-gram features, which often lead to the curse of dimensionality. Contrary to traditional approaches, this article proposes a new machine learning scheme based on the fusion of three different architectures, namely convolutional neural networks, recurrent-convolutional neural networks, and support vector machine classifiers, without human-designed features. Word2vec-based word embeddings are used to learn the best word representations for automatic authorship verification: they provide semantic vectors that capture the most relevant information about the raw text in a relatively small dimension. Since the classifiers generally make different errors on the same learning samples, combining them brings together several points of view and preserves the relevant information contained in the different classifiers. The final decision of our system is obtained by combining the results of the three models using the voting method
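    The voting fusion step can be sketched as a simple hard majority vote over the three models' per-sample predictions. The prediction arrays below are invented placeholders for the outputs of the CNN, RCNN, and SVM branches.

    ```python
    import numpy as np

    def majority_vote(*prediction_arrays):
        """Hard-voting fusion: each classifier casts one vote per sample."""
        stacked = np.stack(prediction_arrays)  # (n_classifiers, n_samples)
        return np.array([np.bincount(col).argmax() for col in stacked.T])

    # Hypothetical per-sample verdicts (1 = same author, 0 = different).
    cnn_preds = np.array([1, 0, 1, 1])
    rcnn_preds = np.array([1, 1, 0, 1])
    svm_preds = np.array([0, 0, 0, 1])

    fused = majority_vote(cnn_preds, rcnn_preds, svm_preds)
    ```

    With three voters there are no ties, so a disagreement by any single model is always outvoted by the other two.
    
    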